Abstract

Formulas are derived for the bias in the maximum likelihood
estimators of the item parameters in the logistic item response model
when examinee abilities are known. Numerical results are given for a
typical verbal test for college admission.

Statistical Bias in Maximum Likelihood Estimators of Item Parameters*

This paper derives formulas for the statistical bias in the
maximum likelihood estimators (MLE) of item parameters in item
response theory (IRT) [Lord, 1980]. It will deal only with the
three-parameter logistic model for dichotomously scored items. Available
formulas for the sampling variance of these MLE are limited to the case
where the examinee parameters are known; the present derivations are
limited to this case also. Under the three-parameter logistic model,
the probability of a correct answer to an item is the following function
of examinee ability level $\theta$:

$$P \equiv P(\theta) \equiv c + \frac{1 - c}{1 + e^{-A\theta - B}} = 1 - \frac{1 - c}{1 + e^{A\theta + B}} \tag{1}$$

where $A$, $B$, and $c$ are parameters describing the item.
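
As a minimal sketch of equation (1), the two algebraically equivalent forms can be coded side by side. The function names `p3pl` and `p3pl_alt` and the parameter values in the usage line are illustrative assumptions, not part of the paper.

```python
import math


def p3pl(theta, A, B, c):
    """First form of (1): c + (1 - c) / (1 + exp(-A*theta - B))."""
    return c + (1.0 - c) / (1.0 + math.exp(-(A * theta + B)))


def p3pl_alt(theta, A, B, c):
    """Second form of (1): 1 - (1 - c) / (1 + exp(A*theta + B))."""
    return 1.0 - (1.0 - c) / (1.0 + math.exp(A * theta + B))


# Hypothetical item: at theta = -B/A the probability is (1 + c) / 2.
print(p3pl(0.25, 1.2, -0.3, 0.2), p3pl_alt(0.25, 1.2, -0.3, 0.2))
```

For very low ability the probability approaches the lower asymptote $c$, and for very high ability it approaches 1.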

In practical work, $\hat{A}$, the MLE of $A$, sometimes tends to become
infinite. This suggests a positive bias, at least in some data for
certain items. Is it possible to correct for this sometimes
substantial bias in $\hat{A}$? Practical experience also suggests that
when $B$ is large and positive, $\hat{B}$ tends to be positively biased;
when $B$ is large and negative, $\hat{B}$ tends to be negatively biased.
Practical experience has not provided any clear indications as to the
bias in $\hat{c}$. Since $\hat{c}$ values are most often less than the
reciprocal of the number of choices in a multiple-choice item, it is of
interest to determine whether this apparent anomaly could be due to a
substantial negative bias in $\hat{c}$.

*This work was supported in part by contract N00014-80-C-0402,
project designation NR 150-453 between the Office of Naval Research and
Educational Testing Service. Reproduction in whole or in part is
permitted for any purpose of the United States Government.

The method to be used here to derive formulas for the bias in
estimated item parameters is the same method described in Lord [Note 1].
The reader is referred there for a more detailed discussion. The
following derivation deals with a single fixed item administered
to $N$ examinees with known ability levels
$\theta_1, \theta_2, \ldots, \theta_N$.


1. Likelihood Equations

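Let $\alpha$ denote either $A$, $B$, or $c$. We assume, as in Lord
[Note 1], that $A$, $B$, and $\theta$ are bounded and that $c$ is bounded
away from 1. Under these conditions, $\hat{\alpha}$ is a consistent
estimator of $\alpha$ and $\sqrt{N}(\hat{\alpha} - \alpha)$ is
asymptotically normally distributed with mean zero and finite variance.
It follows that $\mathcal{E}(\hat{\alpha} - \alpha)^r$ is at most of
order $N^{-r/2}$.

Let $u_a = 0$ or $1$ denote the score of examinee $a$
($a = 1, 2, \ldots, N$) on the given (dichotomously scored) item. Write
$P_a \equiv P(\theta_a)$ and $Q_a \equiv 1 - P_a$. For
$a = 1, 2, \ldots, N$ and $\alpha = A, B, c$, write

$$P^{\prime a}_{\alpha} \equiv \frac{\partial P_a}{\partial \alpha} , \tag{2}$$

$$R^{\prime a}_{\alpha} \equiv \frac{P^{\prime a}_{\alpha}}{P_a Q_a} , \tag{3}$$

$$\Gamma^{a}_{\alpha} \equiv (u_a - P_a) R^{\prime a}_{\alpha} , \tag{4}$$

$$L_{\alpha} \equiv \sum_{a=1}^{N} \Gamma^{a}_{\alpha} . \tag{5}$$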


The likelihood equations are now

$$\hat{L}_{\alpha} = 0 \qquad (\alpha = A, B, c) \tag{6}$$

where the caret denotes substitution of MLE for $A$, $B$, and $c$.
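
The left side of the likelihood equations can be sketched numerically as the sum over examinees of $(u_a - P_a) P^{\prime a}_{\alpha} / (P_a Q_a)$. The sketch below is an illustration only: the function names, the five ability values, and the response pattern are assumptions, and the closed-form first derivatives of (1) it uses are those given in Section 4.

```python
import math


def p3pl(theta, A, B, c):
    """Equation (1): probability of a correct answer."""
    return c + (1.0 - c) / (1.0 + math.exp(-(A * theta + B)))


def log_likelihood(thetas, us, A, B, c):
    """Log likelihood of the 0/1 scores us at known abilities thetas."""
    total = 0.0
    for theta, u in zip(thetas, us):
        P = p3pl(theta, A, B, c)
        total += u * math.log(P) + (1 - u) * math.log(1.0 - P)
    return total


def score(thetas, us, A, B, c):
    """[L_A, L_B, L_c]: sum over examinees of (u - P) * P'_alpha / (P * Q)."""
    L = [0.0, 0.0, 0.0]
    for theta, u in zip(thetas, us):
        P = p3pl(theta, A, B, c)
        Q = 1.0 - P
        dP_dc = Q / (1.0 - c)        # derivative of P with respect to c
        dP_dB = (P - c) * dP_dc      # derivative of P with respect to B
        dP_dA = theta * dP_dB        # derivative of P with respect to A
        weight = (u - P) / (P * Q)
        for i, dP in enumerate((dP_dA, dP_dB, dP_dc)):
            L[i] += weight * dP
    return L


# Hypothetical data: five examinees with known abilities.
thetas = [-1.5, -0.5, 0.0, 0.8, 1.6]
us = [0, 1, 0, 1, 1]
print(score(thetas, us, 1.2, -0.3, 0.2))
```

The MLE are the parameter values at which all three components of `score` vanish; a numerical root finder would be needed to locate them, which this sketch does not attempt.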

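2. Taylor Series

Let the symbols $\beta$ $(= A, B, c)$ and $\delta$ $(= A, B, c)$ have the
same meaning as $\alpha$, so that $\sum_{\beta}$ denotes a three-term sum
with $\beta$ taking on the values $A$, $B$, and $c$. Expanding (6) in a
three-variable Taylor series and dividing by $N$, we have

$$0 = \frac{1}{N} \sum_{a} \Big[ \Gamma^{a}_{\alpha} + \sum_{\beta} (\hat{\beta} - \beta)\, \Gamma^{a}_{\alpha\beta} + \frac{1}{2} \sum_{\beta} \sum_{\delta} (\hat{\beta} - \beta)(\hat{\delta} - \delta)\, \Gamma^{a}_{\alpha\beta\delta} + \cdots \Big] \qquad (\alpha = A, B, c) \tag{7}$$

where

$$\Gamma^{a}_{\alpha\beta} \equiv \frac{\partial \Gamma^{a}_{\alpha}}{\partial \beta} , \qquad \Gamma^{a}_{\alpha\beta\delta} \equiv \frac{\partial^{2} \Gamma^{a}_{\alpha}}{\partial \beta\, \partial \delta} . \tag{8}$$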


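For simplicity, write $v$, $w$, $x$, $y$, $z$, or $Z$ instead of
$\hat{\alpha} - \alpha$, $\hat{\beta} - \beta$, or $\hat{\delta} - \delta$.
It will not be confusing to replace subscripts $\alpha$, $\beta$, $\delta$
by $v$, $w$, $x$, $y$, or $z$. The Taylor series is now

$$0 = \frac{1}{N} \sum_{a} \Big[ \Gamma^{a}_{x} + \sum_{y} y\, \Gamma^{a}_{xy} + \frac{1}{2} \sum_{y} \sum_{z} y z\, \Gamma^{a}_{xyz} + \cdots \Big] . \tag{9}$$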


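Define

$$\gamma^{a} \equiv \mathcal{E}\, \Gamma^{a} , \qquad \varepsilon^{a} \equiv \Gamma^{a} - \gamma^{a} \tag{10}$$

(the same subscripts being understood on each symbol). It can be seen
from (4) that

$$\gamma^{a}_{x} = 0 \qquad (x = A, B, c) . \tag{11}$$

Equation (9) can now be written

$$0 = \frac{1}{N} \sum_{a} \Big[ \varepsilon^{a}_{x} + \sum_{y} y\, \gamma^{a}_{xy} + \sum_{y} y\, \varepsilon^{a}_{xy} + \frac{1}{2} \sum_{y} \sum_{z} y z\, (\gamma^{a}_{xyz} + \varepsilon^{a}_{xyz}) + \cdots \Big] . \tag{12}$$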


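Let

$$\gamma_{x} \equiv \frac{1}{N} \sum_{a} \gamma^{a}_{x} , \qquad \varepsilon_{x} \equiv \frac{1}{N} \sum_{a} \varepsilon^{a}_{x} , \qquad \gamma_{xy} \equiv \frac{1}{N} \sum_{a} \gamma^{a}_{xy} , \tag{13}$$

and so forth. The Taylor series is now

$$0 = \varepsilon_{x} + \sum_{y} y\, \gamma_{xy} + \sum_{y} y\, \varepsilon_{xy} + \frac{1}{2} \sum_{y} \sum_{z} y z\, (\gamma_{xyz} + \varepsilon_{xyz}) + \cdots . \tag{14}$$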


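Rewrite this in matrix notation:

$$\Gamma y = -\varepsilon - E y - M y - H y - \cdots \tag{15}$$

where $\Gamma \equiv \|\gamma_{xy}\|$,
$y \equiv \{\hat{A} - A,\ \hat{B} - B,\ \hat{c} - c\}'$,
$\varepsilon \equiv \{\varepsilon_{x}\}$, $E \equiv \|\varepsilon_{xy}\|$,
$M \equiv \|\tfrac{1}{2} \sum_{z} z\, \gamma_{xyz}\|$, and
$H \equiv \|\tfrac{1}{2} \sum_{z} z\, \varepsilon_{xyz}\|$.
Premultiply (15) by $\Gamma^{-1}$ to obtain finally

$$y = -\Gamma^{-1} \varepsilon - \Gamma^{-1} E y - \Gamma^{-1} M y - \Gamma^{-1} H y - \cdots . \tag{16}$$

The expectation of (16) gives the bias of the vector $y$ of
maximum likelihood estimators. First we will need to eliminate $y$
from the right side of (16).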

3. Solving for y

Premultiply (16) by $\Gamma^{-1} H$ to obtain

$$\Gamma^{-1} H y = -\Gamma^{-1} H \Gamma^{-1} \varepsilon - \cdots . \tag{17}$$

In Section 5 it will become clear that the higher-order terms in
(17) can be neglected. Equation (17) allows us to eliminate $y$ from
the last term in (16).

Similarly, to evaluate the next-to-last term in (16), premultiply
(16) by $\Gamma^{-1} M$, obtaining

$$\Gamma^{-1} M y = -\Gamma^{-1} M \Gamma^{-1} \varepsilon - \cdots . \tag{18}$$

Likewise, premultiply (16) by $\Gamma^{-1} E$ to obtain

$$\Gamma^{-1} E y = -\Gamma^{-1} E \Gamma^{-1} \varepsilon - \cdots . \tag{19}$$

We eliminate $y$ from the right-hand side of (16) by substituting
(17)-(19) into (16):

$$y = -\Gamma^{-1} \varepsilon + \Gamma^{-1} (E + M + H) \Gamma^{-1} \varepsilon + \cdots .$$


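In scalar notation, this is

$$y = -\sum_{x} \gamma^{yx} \varepsilon_{x} + \sum_{v} \sum_{w} \sum_{x} \gamma^{yv} (\varepsilon_{vw} + m_{vw} + h_{vw})\, \gamma^{wx} \varepsilon_{x} + \cdots \tag{20}$$

where $\|\gamma^{vx}\| \equiv \|\gamma_{vx}\|^{-1}$,
$m_{vw} \equiv \tfrac{1}{2} \sum_{y} y\, \gamma_{vwy}$, and
$h_{vw} \equiv \tfrac{1}{2} \sum_{y} y\, \varepsilon_{vwy}$.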

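To evaluate $m_{vw}$, multiply (20) by $\tfrac{1}{2} \gamma_{vwy}$ and sum
over $y$ to obtain

$$m_{vw} = -\frac{1}{2} \sum_{y} \sum_{x} \gamma_{vwy}\, \gamma^{yx} \varepsilon_{x} + \cdots . \tag{21}$$

For $h_{vw}$, similarly,

$$h_{vw} = -\frac{1}{2} \sum_{y} \sum_{x} \varepsilon_{vwy}\, \gamma^{yx} \varepsilon_{x} + \cdots . \tag{22}$$

Substituting (21) and (22) into (20), we have

$$y = -\sum_{x} \gamma^{yx} \varepsilon_{x} + \sum_{v} \sum_{w} \sum_{x} \gamma^{yv} \gamma^{wx} \varepsilon_{vw} \varepsilon_{x} - \frac{1}{2} \sum_{v} \sum_{w} \sum_{x} \sum_{z} \sum_{Z} \gamma^{yv} \gamma^{wx} \gamma^{zZ} (\gamma_{vwz} + \varepsilon_{vwz})\, \varepsilon_{Z} \varepsilon_{x} + \cdots . \tag{23}$$

The bias of $\hat{A}$, $\hat{B}$, or $\hat{c}$ is found by taking the
expectation of (23). First we need formulas for various derivatives that
appear in (23).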
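
As a rough guide to which terms of (23) matter for the bias, the stochastic orders of magnitude can be tabulated. This is a sketch under the usual $O_p$ conventions and is not part of the original derivation; it assumes only that each $\varepsilon$ is an average of $N$ independent zero-mean terms and that the $\gamma$ quantities are of order 1.

```latex
\begin{aligned}
\varepsilon_{x},\ \varepsilon_{vw},\ \varepsilon_{vwz} &= O_p(N^{-1/2}), \qquad y = O_p(N^{-1/2}), \\
\gamma^{yx} \varepsilon_{x} &= O_p(N^{-1/2}) \quad \text{(zero expectation)}, \\
\gamma^{yv} \gamma^{wx} \varepsilon_{vw} \varepsilon_{x}, \quad
\gamma^{yv} \gamma^{wx} \gamma^{zZ} \gamma_{vwz}\, \varepsilon_{Z} \varepsilon_{x} &= O_p(N^{-1}), \\
\gamma^{yv} \gamma^{wx} \gamma^{zZ} \varepsilon_{vwz}\, \varepsilon_{Z} \varepsilon_{x} &= O_p(N^{-3/2}).
\end{aligned}
```

Only the two products of order $N^{-1}$ can contribute to the bias through terms of order $1/N$.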


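4. Derivatives


From (8) and (4),

$$\Gamma^{a}_{\alpha\beta} = u_a R^{\prime\prime a}_{\alpha\beta} - P^{\prime a}_{\beta} R^{\prime a}_{\alpha} - P_a R^{\prime\prime a}_{\alpha\beta} \tag{24}$$

where $R''$ denotes a second derivative. From (24),

$$\Gamma^{a}_{\alpha\beta\delta} = u_a R^{\prime\prime\prime a}_{\alpha\beta\delta} - P^{\prime\prime a}_{\beta\delta} R^{\prime a}_{\alpha} - P^{\prime a}_{\beta} R^{\prime\prime a}_{\alpha\delta} - P^{\prime a}_{\delta} R^{\prime\prime a}_{\alpha\beta} - P_a R^{\prime\prime\prime a}_{\alpha\beta\delta} . \tag{25}$$

To evaluate (24) and (25), various derivatives of (1) are required.
Dropping the affix $a$, we find

$$P'_{c} = \frac{1}{1 + e^{A\theta + B}} = \frac{Q}{1 - c} , \tag{26}$$

$$P'_{B} = \frac{Q (P - c)}{1 - c} = (P - c)\, P'_{c} , \tag{27}$$

$$P'_{A} = \theta P'_{B} , \tag{28}$$

$$P''_{cc} = 0 , \tag{29}$$

$$P''_{Bc} = -\frac{P'_{B}}{1 - c} , \tag{30}$$

$$P''_{BB} = \frac{P'_{B}}{1 - c}\, (1 - 2P + c) , \tag{31}$$

$$P''_{Ac} = -\frac{P'_{A}}{1 - c} , \tag{32}$$

$$P''_{AB} = \theta P''_{BB} , \tag{33}$$

$$P''_{AA} = \theta P''_{AB} . \tag{34}$$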
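
The closed forms above can be checked against central finite differences. The following is a sketch for verification only; the function names and the test point are assumptions chosen for illustration.

```python
import math


def p3pl(theta, A, B, c):
    """Equation (1)."""
    return c + (1.0 - c) / (1.0 + math.exp(-(A * theta + B)))


def first_derivs(theta, A, B, c):
    """(dP/dA, dP/dB, dP/dc) from the closed forms for P'_A, P'_B, P'_c."""
    P = p3pl(theta, A, B, c)
    dc = (1.0 - P) / (1.0 - c)
    dB = (P - c) * dc
    return (theta * dB, dB, dc)


def second_derivs(theta, A, B, c):
    """Second derivatives of P, keyed by the pair of parameters."""
    P = p3pl(theta, A, B, c)
    dA, dB, _ = first_derivs(theta, A, B, c)
    dBB = dB * (1.0 - 2.0 * P + c) / (1.0 - c)
    return {
        "cc": 0.0,
        "Bc": -dB / (1.0 - c),
        "BB": dBB,
        "Ac": -dA / (1.0 - c),
        "AB": theta * dBB,
        "AA": theta * theta * dBB,
    }


def central_diff(f, x, h=1e-5):
    """Two-sided numerical derivative of a one-argument function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Hypothetical test point.
th, A, B, c = 0.7, 1.3, -0.4, 0.2
print(first_derivs(th, A, B, c))
print(central_diff(lambda v: p3pl(th, A, v, c), B))  # compare with dP/dB
```

Differencing `first_derivs` in the same way gives a numerical check on each entry of `second_derivs`.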


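5. Expectations


Since $\mathcal{E} u_a = P_a$,

$$\gamma_{x} \equiv \frac{1}{N} \sum_{a} \gamma^{a}_{x} = 0 . \tag{35}$$

From (24)-(25),

$$\gamma_{xy} = -\frac{1}{N} \sum_{a} P^{\prime a}_{y} R^{\prime a}_{x} = -\frac{1}{N} \sum_{a} \frac{P^{\prime a}_{x} P^{\prime a}_{y}}{P_a Q_a} , \tag{36}$$

$$\begin{aligned}
\gamma_{xyz} &= -\frac{1}{N} \sum_{a} \left( P^{\prime\prime a}_{yz} R^{\prime a}_{x} + P^{\prime a}_{y} R^{\prime\prime a}_{xz} + P^{\prime a}_{z} R^{\prime\prime a}_{xy} \right) \\
&= \frac{1}{N} \sum_{a} \left[ 2 P^{\prime a}_{x} P^{\prime a}_{y} P^{\prime a}_{z}\, \frac{Q_a - P_a}{P_a^{2} Q_a^{2}} - \frac{P^{\prime a}_{z} P^{\prime\prime a}_{xy}}{P_a Q_a} - \frac{P^{\prime a}_{x} P^{\prime\prime a}_{yz}}{P_a Q_a} - \frac{P^{\prime a}_{y} P^{\prime\prime a}_{xz}}{P_a Q_a} \right] .
\end{aligned} \tag{37}$$

Writing

$$t_a \equiv \frac{u_a - P_a}{P_a Q_a} ,$$

we now have

$$\varepsilon_{x} = \frac{1}{N} \sum_{a} t_a P^{\prime a}_{x} , \tag{38}$$

$$\varepsilon_{xy} = \frac{1}{N} \sum_{a} (u_a - P_a) R^{\prime\prime a}_{xy} = \frac{1}{N} \sum_{a} t_a P^{\prime\prime a}_{xy} + \frac{1}{N} \sum_{a} P^{\prime a}_{x} P^{\prime a}_{y} \left( \frac{1}{P_a Q_a} - t_a^{2} \right) . \tag{39}$$

To evaluate (23) we need

$$\mathcal{E} \varepsilon_{x} = 0 , \tag{40}$$

$$\mathcal{E} \varepsilon_{x} \varepsilon_{z} = \frac{1}{N^{2}} \sum_{a} \sum_{b} P^{\prime a}_{x} P^{\prime b}_{z} \operatorname{Cov}(t_a, t_b) = \frac{1}{N^{2}} \sum_{a} \frac{P^{\prime a}_{x} P^{\prime a}_{z}}{P_a Q_a} = -\frac{1}{N}\, \gamma_{xz} . \tag{41}$$

This last result is obtained from (38), using the fact that
$\operatorname{Cov}(t_a, t_b)$ is zero when $a \neq b$ and that
$\operatorname{Var} t_a = 1 / P_a Q_a$. Similarly, from (38)
and (39), we find that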


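$$\begin{aligned}
\mathcal{E} \varepsilon_{vw} \varepsilon_{x} &= \frac{1}{N^{2}} \sum_{a} \sum_{b} P^{\prime\prime a}_{vw} P^{\prime b}_{x} \operatorname{Cov}(t_a, t_b) - \frac{1}{N^{2}} \sum_{a} \sum_{b} P^{\prime a}_{v} P^{\prime a}_{w} P^{\prime b}_{x} \operatorname{Cov}(t_a^{2}, t_b) \\
&= \frac{1}{N^{2}} \sum_{a} \left( \frac{P^{\prime\prime a}_{vw} P^{\prime a}_{x}}{P_a Q_a} + P^{\prime a}_{v} P^{\prime a}_{w} P^{\prime a}_{x}\, \frac{P_a - Q_a}{P_a^{2} Q_a^{2}} \right) .
\end{aligned} \tag{42}$$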
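
The moments of $t_a$ used in (41) and (42) follow from $u_a$ being a 0/1 variable with $\mathcal{E} u_a = P_a$. The short check below is written out for convenience and is not taken from the original text.

```latex
\begin{aligned}
\mathcal{E}(u_a - P_a)^{2} &= P_a Q_a^{2} + Q_a P_a^{2} = P_a Q_a
  &&\Rightarrow\quad \operatorname{Var} t_a = \frac{1}{P_a Q_a} , \\
\mathcal{E}(u_a - P_a)^{3} &= P_a Q_a^{3} - Q_a P_a^{3} = P_a Q_a (Q_a - P_a)
  &&\Rightarrow\quad \operatorname{Cov}(t_a^{2}, t_a) = \mathcal{E}\, t_a^{3} = \frac{Q_a - P_a}{P_a^{2} Q_a^{2}} .
\end{aligned}
```

Because different examinees respond independently, the corresponding covariances vanish when $a \neq b$, which is why the double sums collapse to single sums.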


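Also, from (25) and (37),

$$\mathcal{E} \varepsilon_{vwz} \varepsilon_{Z} \varepsilon_{x} = \frac{1}{N^{3}} \sum_{a} \sum_{b} \sum_{c} \mathcal{E} \left[ (u_a - P_a) R^{\prime\prime\prime a}_{vwz}\, (u_b - P_b) \frac{P^{\prime b}_{Z}}{P_b Q_b}\, (u_c - P_c) \frac{P^{\prime c}_{x}}{P_c Q_c} \right] . \tag{43}$$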

Since $\mathcal{E}(u_a - P_a)(u_b - P_b)(u_c - P_c)$ vanishes unless
$a = b = c$, (43) is of order $1/N^{2}$ and can be neglected in
evaluating the expectation of (23). The order of magnitude of other
terms neglected in preceding sections can be found by the same method.

6. Bias in Item Parameter MLE

The bias in the MLE of an item parameter is now found by writing
down the expectation of (23), dropping the last term (the one in
$\varepsilon_{vwz}$, whose expectation (43) is of order $1/N^{2}$), and
evaluating the remaining terms on the right by (40), (41), and (42). The
resulting formula for the bias, accurate through terms of order $1/N$, is


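$$\begin{aligned}
\mathcal{E} y &= \sum_{v} \sum_{w} \sum_{x} \gamma^{yv} \gamma^{wx}\, \mathcal{E} \varepsilon_{vw} \varepsilon_{x} + \frac{1}{2N} \sum_{v} \sum_{w} \sum_{x} \sum_{z} \sum_{Z} \gamma^{yv} \gamma^{wx} \gamma^{zZ} \gamma_{vwz} \gamma_{Zx} \\
&= \sum_{v} \sum_{w} \sum_{x} \gamma^{yv} \gamma^{wx}\, \mathcal{E} \varepsilon_{vw} \varepsilon_{x} + \frac{1}{2N} \sum_{v} \sum_{w} \sum_{x} \gamma^{yv} \gamma^{wx} \gamma_{vwx}
\end{aligned} \tag{44}$$

(since the sum over $Z$ equals 1 when $z = x$ and vanishes otherwise). The
terms on the right are evaluated using (36), (37), and (42). The $y$ on
the left side of (44) is either $\hat{A} - A$, $\hat{B} - B$, or
$\hat{c} - c$. The affixes on the right side denote either $A$, $B$, or
$c$; $\|\gamma^{vw}\|$ denotes the inverse of $\|\gamma_{vw}\|$.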
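
Formula (44) can be evaluated directly for one item once the abilities are given. The sketch below is a plain-Python illustration, assuming the closed-form derivatives of Section 4 and the expectations (36), (37), and (42); the function names, the ability grid, and the item parameters in the usage lines are hypothetical and are not the data of the numerical example.

```python
import math


def derivs(theta, A, B, c):
    """P, Q, first derivatives [A, B, c], and 3x3 second derivatives of P."""
    P = c + (1.0 - c) / (1.0 + math.exp(-(A * theta + B)))
    Q = 1.0 - P
    dc = Q / (1.0 - c)
    dB = (P - c) * dc
    dA = theta * dB
    dBc = -dB / (1.0 - c)
    dBB = dB * (1.0 - 2.0 * P + c) / (1.0 - c)
    dAc, dAB, dAA = theta * dBc, theta * dBB, theta * theta * dBB
    d2 = [[dAA, dAB, dAc], [dAB, dBB, dBc], [dAc, dBc, 0.0]]
    return P, Q, [dA, dB, dc], d2


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[adj[r][s] / det for s in range(3)] for r in range(3)]


def bias_ABc(thetas, A, B, c):
    """Bias of (A-hat, B-hat, c-hat) through order 1/N, following (44)."""
    N = len(thetas)
    R = range(3)
    g2 = [[0.0] * 3 for _ in R]                  # gamma_xy, eq. (36)
    g3 = [[[0.0] * 3 for _ in R] for _ in R]     # gamma_xyz, eq. (37)
    ee = [[[0.0] * 3 for _ in R] for _ in R]     # E eps_xy eps_z, eq. (42)
    for theta in thetas:
        P, Q, d1, d2 = derivs(theta, A, B, c)
        pq = P * Q
        for x in R:
            for y in R:
                g2[x][y] -= d1[x] * d1[y] / pq / N
                for z in R:
                    cubic = d1[x] * d1[y] * d1[z] * (Q - P) / pq ** 2
                    mixed = (d1[x] * d2[y][z] + d1[y] * d2[x][z]
                             + d1[z] * d2[x][y])
                    g3[x][y][z] += (2.0 * cubic - mixed / pq) / N
                    ee[x][y][z] += (d2[x][y] * d1[z] / pq - cubic) / N ** 2
    gi = inv3(g2)                                # gamma^{xy}
    return [sum(gi[y][v] * gi[w][x] * (ee[v][w][x] + g3[v][w][x] / (2.0 * N))
                for v in R for w in R for x in R) for y in R]


# Hypothetical design: 25 equally spaced abilities, one illustrative item.
thetas = [-3.0 + 0.25 * i for i in range(25)]
print(bias_ABc(thetas, 1.7, 0.0, 0.2))
```

Because every $\gamma$ is an average over examinees while $\mathcal{E} \varepsilon_{vw} \varepsilon_{x}$ carries a factor $1/N$, repeating the same ability list twice halves each component of the returned bias, which gives a quick consistency check on an implementation.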


7. Reparameterization


The preceding sections derive the bias of $\hat{A}$ and $\hat{B}$ (for
convenience), whereas the item parameters commonly used are
$a \equiv A/1.7$ and $b \equiv -B/1.7a$. The bias of $\hat{a}$ is clearly
equal to the bias of $\hat{A}$ divided by 1.7. The bias of $\hat{b}$ may
be found as follows.

$$\operatorname{Cov}(\hat{a}, \hat{b}) \doteq \mathcal{E}(\hat{a} - a)(\hat{b} - b) \equiv \mathcal{E}(\hat{a}\hat{b} - ab) - b\, \mathcal{E}(\hat{a} - a) - a\, \mathcal{E}(\hat{b} - b) .$$

Rearranging this identity and using $\hat{a}\hat{b} = -\hat{B}/1.7$, we
find the bias of $\hat{b}$:

$$\mathcal{E}(\hat{b} - b) \doteq \frac{1}{A} \left[ -\mathcal{E}(\hat{B} - B) - b\, \mathcal{E}(\hat{A} - A) - 1.7 \operatorname{Cov}(\hat{a}, \hat{b}) \right] . \tag{45}$$

The required covariance on the right is obtained in the usual way,
by inverting the information matrix [Lord, 1980, p. 191].
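
The reparameterization step reduces to a few lines of arithmetic. The sketch below assumes the biases of $\hat{A}$ and $\hat{B}$ and the covariance of $\hat{a}$ and $\hat{b}$ have already been computed elsewhere; the function names and the numbers in the usage line are hypothetical.

```python
def bias_a(bias_A):
    """Bias of a-hat, where a = A / 1.7."""
    return bias_A / 1.7


def bias_b(A, B, bias_A, bias_B, cov_ab):
    """Bias of b-hat from (45), where b = -B / (1.7 a) = -B / A."""
    b = -B / A
    return (-bias_B - b * bias_A - 1.7 * cov_ab) / A


# Hypothetical inputs: A = 1.7 and B = -1.7 give a = 1 and b = 1.
print(bias_a(0.017), bias_b(1.7, -1.7, 0.017, -0.034, 0.001))
```

Note that the covariance term enters with a negative sign, so a positive sampling covariance between $\hat{a}$ and $\hat{b}$ pulls the bias of $\hat{b}$ downward.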

8. Numerical Example

Figures 1, 2, 4, 6 show the bias in $\hat{b}$, $\hat{a}$, and $\hat{c}$
for a set of 90 items selected to represent very roughly a typical verbal
test for college admissions. These are artificial data; thus the true
parameters are known. The number of examinees used to estimate the
item parameters is 2995.

Because of the limitations of three-dimensional plotting, Figure
1 shows only those items for which $\hat{b}$ is positively biased;
Figure 2 shows the remaining items. Easy and medium-difficulty items are
negatively biased; only difficult items are positively biased. Items
with $b$ = 1.5 to 1.8 have near-zero bias. For five items the bias is
so large that it runs off the plot. The item parameters and biases
for these five items are as follows:

  $a$     $b$     $c$    $\mathcal{E}(\hat{b}-b)$   $\mathcal{E}(\hat{a}-a)$   $\mathcal{E}(\hat{c}-c)$
  .44    -4.4    .01            -.37                      .009                      -.43
  .43    -3.1    .10            -.11                      .006                      -.08
  .41    -2.2    .15            -.08                      .005                      -.04
  .37    -1.8    .01            -.07                      .00?                      -.03
  .41    -0.4    .17            -.05                      .004                      -.01

The first two of these five items do not appear in the plots at all
because the $b$ values lie outside the plotted range. We see that
low discriminating power and low difficulty (high easiness) give
rise to large estimation errors for $b$, as might be expected.

Figures 3, 5, 7 show the standard errors of $\hat{b}$, $\hat{a}$, and
$\hat{c}$ for comparison. For clarity the $a$ and $b$ scales are oriented
one way in Figures 1-3 and 6-7, the opposite way in Figures 4-5. Note
that the vertical scales vary from figure to figure.

The bias in $\hat{a}$ is positive for all items; the bias in $\hat{c}$
is negative for all items. In general, an estimate that has a large
standard error tends to have a numerically large bias.

For very easy items, $\hat{b}$ and $\hat{c}$ have numerically large
biases and large standard errors. For hard items, $\hat{c}$ has a
numerically small bias and small standard error; $\hat{a}$ has a large
bias and large standard error. The bias and standard error of $\hat{b}$
both increase for very difficult items. Highly discriminating items have
numerically small bias and small standard error for $\hat{b}$ and
$\hat{c}$. Poorly discriminating items tend to have low bias and low
standard error for $\hat{a}$.

The plots show the relation of bias (or of standard error)
to $a$ and $b$. The relation to $c$ is not easily shown graphically.
If the value of $c$ had an important effect on the bias and
standard error of the MLE, neighboring items in Figures 1-7 would
frequently have quite different biases or standard errors. The fact
that neighboring items typically appear very similar in the figures
indicates that $c$ typically has a relatively minor effect on their
bias and standard error.

Most  typically  the  bias  of  an  MLE  is  about  one-tenth  of  its 
standard  error.  It  is  very  seldom  more  than  a  fifth  of  its 
standard  error. 

The effect of the bias for individual item-parameter estimates is
thus probably negligible. However, the invariably positive bias in
the $\hat{a}$, for example, may have a cumulative effect over many items
so that its effect is no longer negligible.

It is just this type of effect that makes the variance across
examinees of the MLE of $\theta$ a gross overestimate of the true variance
of $\theta$ across examinees [Lord, Note 1]. An unbiased estimate of
$\sigma_{\theta}^{2}$ is derived in the cited reference. Although
theoretically possible, it will be more difficult to work out similarly
unbiased estimators of equatings and other commonly computed functions of
estimated item parameters.